On the use of line spectral frequency parameters for speech recognition
Abstract
The line spectral frequency (LSF) representation was proposed by Itakura [1] as an alternative linear prediction (LP) parametric representation. In the context of speech coding, it has been shown [2-6] that this representation has better quantization properties than other LP parametric representations (such as log area ratios and reflection coefficients). The LSF representation can reduce the bit rate needed to transmit the LP information by 25-30% without degrading the quality of the synthesized speech [4,5]. Our interest in the LSF representation is to see whether it offers a similar advantage for speech recognition. In an earlier paper [7], we studied this representation for the recognition of steady-state vowel frames in the speaker-dependent mode using the minimum distance classifier. Although the LSF representation performed well there [7], the scope of those results was very limited. The aim of the present paper is to extend the use of the LSF representation to more general speech recognition systems and to widen the scope of its results. (Some of these results have been reported earlier at a conference [8].) To this end, we study this representation in both the speaker-dependent and the speaker-independent modes for hidden Markov model (HMM)-based isolated word recognition systems. Since HMM-based speech recognizers use the maximum likelihood decision rule for recognition, we also report results for speaker-dependent and speaker-independent vowel recognition experiments using the maximum likelihood classifier. In the present paper, we compare the performance of
Journal title:
- Digital Signal Processing
Volume 2, Issue
Pages -
Publication date 1992